Abstract

We show how solution concepts in games such as 
Nash equilibrium, correlated equilibrium, rational- 
izability, and sequential equilibrium can be given 
a uniform definition in terms of knowledge-based 
programs. Intuitively, all solution concepts are im- 
plementations of two knowledge-based programs, 
one appropriate for games represented in normal 
form, the other for games represented in extensive 
form. These knowledge-based programs can be 
viewed as embodying rationality. The representa- 
tion works even if (a) information sets do not cap- 
ture an agent's knowledge, (b) uncertainty is not 
represented by probability, or (c) the underlying 
game is not common knowledge. 

1 Introduction 

Game theorists represent games in two standard ways: in 
normal form, where each agent simply chooses a strategy, 
and in extensive form, using game trees, where the agents 
make choices over time. An extensive-form representation 
has the advantage that it describes the dynamic structure of 
the game — it explicitly represents the sequence of decision 
problems encountered by agents. However, the extensive- 
form representation purports to do more than just describe 
the structure of the game; it also attempts to represent the 
information that players have in the game, by the use of in- 
formation sets. Intuitively, an information set consists of a 
set of nodes in the game tree where a player has the same 
information. However, as Halpern [1997] has pointed out, 
information sets may not adequately represent a player's in- 
formation. 

Halpern makes this point by considering the following 
single-agent game of imperfect recall, originally presented by 
Piccione and Rubinstein [1997]: The game starts with nature 
moving either left or right, each with probability 1/2. The 
agent can then either stop the game (playing move S) and get 
Figure 1: A game of imperfect recall.

a payoff of 2, or continue, by playing move B. If he contin- 
ues, he gets a high payoff if he matches nature's move, and a 
low payoff otherwise. Although he originally knows nature's 
move, the information set that includes the nodes labeled $x_3$
and $x_4$ is intended to indicate that the player forgets whether
nature moved left or right after moving B. Intuitively, when
he is at the information set X, the agent is not supposed to
know whether he is at $x_3$ or at $x_4$.

It is not hard to show that the strategy that maximizes expected
utility chooses action S at node $x_1$, action B at node
$x_2$, and action R at the information set X consisting of $x_3$
and $x_4$. Call this strategy $f$. Let $f'$ be the strategy of choosing
action B at $x_1$, action S at $x_2$, and L at X. Piccione and
Rubinstein argue that if node $x_1$ is reached, the player should
reconsider, and decide to switch from $f$ to $f'$. As Halpern
points out, this is indeed true, provided that the player knows
at each stage of the game what strategy he is currently using.
However, in that case, if the player is using $f$ at the information
set, then he knows that he is at node $x_4$; if he has
switched and is using $f'$, then he knows that he is at $x_3$. So,
in this setting, it is no longer the case that the player does not
know whether he is at $x_3$ or $x_4$ in the information set; he can
infer which state he is at from the strategy he is using.

In game theory, a strategy is taken to be a function from in- 
formation sets to actions. The intuition behind this is that, 
since an agent cannot tell the nodes in an information set
apart, he must do the same thing at all these nodes. But this
example shows that if the agent has imperfect recall but can 
switch strategies, then he can arrange to do different things 
at different nodes in the same information set. As Halpern 
[1997] observes, '"situations that [an agent] cannot distin- 
guish" and "nodes in the same information set" may be two 
quite different notions.' He suggests using the game tree to 
describe the structure of the game, and using the runs and systems
framework [Fagin et al., 1995] to describe the agent's
information. The idea is that an agent has an internal local
state that describes all the information that he has. A strategy
(or protocol in the language of [Fagin et al., 1995]) is a
function from local states to actions. Protocols capture the
intuition that what an agent does can depend only on what he
knows. But now an agent's knowledge is represented by its
local state, not by an information set. Different assumptions 
about what agents know (for example, whether they know 
their current strategies) are captured by running the same pro- 
tocol in different contexts. If the information sets appropri- 
ately represent an agent's knowledge in a game, then we can 
identify local states with information sets. But, as the exam- 
ple above shows, we cannot do this in general. 

A number of solution concepts have been considered in the 
game-theory literature, ranging from Nash equilibrium and 
correlated equilibrium to refinements of Nash equilibrium 
such as sequential equilibrium and weaker notions such as 
rationalizability (see [Osborne and Rubinstein, 1994] for an 
overview). The fact that game trees represent both the game 
and the players' information has proved critical in defining 
solution concepts in extensive-form games. Can we still rep- 
resent solution concepts in a useful way using runs and sys- 
tems to represent a player's information? As we show here, 
not only can we do this, but we can do it in a way that gives 
deeper insight into solution concepts. Indeed, all the standard 
solution concepts in the literature can be understood as instances
of a single knowledge-based (kb) program [Fagin et
al., 1995; 1997], which captures the underlying intuition that
a player should make a best response, given her beliefs. The 
differences between solution concepts arise from running the 
kb program in different contexts. 

In a kb program, a player's actions depend explicitly on 
the player's knowledge. For example, a kb program could 
have a test that says "If you don't know that Ann received the 
information, then send her a message", which can be written 

if $\neg B_i$(Ann received info) then send Ann a message.

This kb program has the form of a standard if ... then statement,
except that the test in the if clause is a test on i's knowledge
(expressed using the modal operator $B_i$ for belief; see
Section 2 for a discussion of the use of knowledge vs. belief).

Using such tests for knowledge allows us to abstract away 
from low-level details of how the knowledge is obtained. Kb 
programs have been applied to a number of problems in the 
computer science literature (see [Fagin et al., 1995] and the
references therein). To see how they can be applied to understand
equilibrium, given a game $\Gamma$ in normal form, let $\mathcal{S}_i(\Gamma)$
consist of all the pure strategies for player i in $\Gamma$. Roughly
speaking, we want a kb program that says that if player i
believes that she is about to perform strategy S (which we
express with the formula $\mathit{do}_i(S)$), and she believes that she
would not do any better with another strategy, then she should
indeed go ahead and run S. This test can be viewed as embodying
rationality. There is a subtlety in expressing the
statement "she would not do any better with another strategy".
We express this by saying "if her expected utility, given
that she will use strategy S, is x, then her expected utility if
she were to use strategy S' is at most x." The "if she were
to use S'" is a counterfactual statement. She is planning
to use strategy S, but is contemplating what would happen
if she were to do something counter to fact, namely, to use
S'. Counterfactuals have been the subject of intense study
in the philosophy literature (see, for example, [Lewis, 1973;
Stalnaker, 1968]) and, more recently, in the game theory literature
(see, for example, [Aumann, 1995; Halpern, 2001;
Samet, 1996]). We write the counterfactual "If A were the
case then B would be true" as "$A > B$". Although this statement
involves an "if ... then", the semantics of the counterfactual
implication $A > B$ is quite different from the material
implication $A \Rightarrow B$. In particular, while $A \Rightarrow B$ is true if A
is false, $A > B$ might not be.

With this background, consider the following kb program 
for player i: 

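for each strategy $S \in \mathcal{S}_i(\Gamma)$ do
    if $B_i(\mathit{do}_i(S) \land \forall x(\mathrm{EU}_i = x \Rightarrow \bigwedge_{S' \in \mathcal{S}_i(\Gamma)} (\mathit{do}_i(S') > (\mathrm{EU}_i \le x))))$ then $S$.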

This kb program is meant to capture the intuition above. Intuitively,
it says that if player i believes that she is about to
perform strategy S and that, if her expected utility is x, then
her expected utility would be no greater than x were she to
perform any other strategy S', then she should perform
strategy S. Call this kb program $\mathrm{EQNF}^{\Gamma}$ (with the individual
instance for player i denoted by $\mathrm{EQNF}_i^{\Gamma}$). As we
show, if all players follow $\mathrm{EQNF}^{\Gamma}$, then they end up playing
some type of equilibrium. Which type of equilibrium they
play depends on the context. Due to space considerations,
we focus on three examples in this abstract. If the players
have a common prior on the joint strategies being used, and
this common prior is such that players' beliefs are independent
of the strategies they use, then they play a Nash equilibrium.
Without this independence assumption, we get a correlated
equilibrium. On the other hand, if players have possibly
different priors on the space of strategies, then this kb
program defines rationalizable strategies [Bernheim, 1984;
Pearce, 1984].

To deal with extensive-form games, we need a slightly dif- 
ferent kb program, since agents choose moves, not strategies. 
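Let $\mathrm{EQEF}_i^{\Gamma}$ be the following program, where $a \in \mathit{PM}$ denotes
that a is a move that is currently possible.

for each move $a \in \mathit{PM}$ do
    if $B_i(\mathit{do}_i(a) \land \forall x((\mathrm{EU}_i = x) \Rightarrow \bigwedge_{a' \in \mathit{PM}} (\mathit{do}_i(a') > (\mathrm{EU}_i \le x))))$ then $a$.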

Just as $\mathrm{EQNF}^{\Gamma}$ characterizes equilibria of a game $\Gamma$ represented
in normal form, $\mathrm{EQEF}^{\Gamma}$ characterizes equilibria of
a game represented in extensive form. We give one example
here: sequential equilibrium. To capture sequential equilibrium,
we need to assume that information sets do correctly
describe an agent's knowledge. If we drop this assumption,
however, we can distinguish between the two equilibria for
the game described in Figure 1.

All these solution concepts are based on expected utility. 
But we can also consider solution concepts based on other 
decision rules. For example, Boutilier and Hyafil [2004] 
consider minimax-regret equilibria, where each player uses 
a strategy that is a best-response in a minimax-regret sense to 
the choices of the other players. Similarly, we can use max- 
imin equilibria [Aghassi and Bertsimas, 2006]. As pointed 
out by Chu and Halpern [2003], all these decision rules can 
be viewed as instances of a generalized notion of expected 
utility, where uncertainty is represented by a plausibility mea- 
sure, a generalization of a probability measure, utilities are 
elements of an arbitrary partially ordered space, and plausibilities
and utilities are combined using $\oplus$ and $\otimes$, generalizations
of $+$ and $\times$. We show in the full paper that, just by interpreting
"$\mathrm{EU}_i = u$" appropriately, we can capture these more
exotic solution concepts as well. Moreover, we can capture 
solution concepts in games where the game itself is not com- 
mon knowledge, or where agents are not aware of all moves 
available, as discussed by Halpern and Rego [2006]. 

Our approach thus provides a powerful tool for represent- 
ing solution concepts, which works even if (a) information 
sets do not capture an agent's knowledge, (b) uncertainty is 
not represented by probability, or (c) the underlying game is 
not common knowledge. 

The rest of this paper is organized as follows. In Sec- 
tion 2, we review the relevant background on game theory 
and knowledge-based programs. In Section 3, we show that 
$\mathrm{EQNF}^{\Gamma}$ and $\mathrm{EQEF}^{\Gamma}$ characterize Nash equilibrium, correlated
equilibrium, rationalizability, and sequential equilibrium
in a game $\Gamma$ in the appropriate contexts. We conclude
in Section 4 with a discussion of how our results compare to 
other characterizations of solution concepts. 

2 Background 

In this section, we review the relevant background on games 
and knowledge-based programs. We describe only what we 
need for proving our results. The reader is encouraged to consult
[Osborne and Rubinstein, 1994] for more on game theory,
[Fagin et al., 1995; 1997] for more on knowledge-based
programs without counterfactuals, and [Halpern and Moses,
2004] for more on adding counterfactuals to knowledge-based
programs.

2.1 Games and Strategies 

A game in extensive form is described by a game tree. Asso- 
ciated with each non-leaf node or history is either a player — 
the player whose move it is at that node — or nature (which 
can make a randomized move). The nodes where a player i 
moves are further partitioned into information sets. With each 
run or maximal history h in the game tree and player i we can 
associate i's utility, denoted $u_i(h)$, if that run is played. A
strategy for player i is a (possibly randomized) function from 
i's information sets to actions. Thus a strategy for player i 
tells player i what to do at each node in the game tree where 
i is supposed to move. Intuitively, at all the nodes that player
i cannot tell apart, player i must do the same thing. A joint
strategy $S = (S_1, \ldots, S_n)$ for the players determines a distribution
over paths in the game tree. A normal-form game can
be viewed as a special case of an extensive-form game where 
each player makes only one move, and all players move si- 
multaneously. 

2.2 Protocols, Systems, and Contexts 

To explain kb programs, we must first describe standard pro- 
tocols. We assume that, at any given point in time, a player in 
a game is in some local state. The local state could include 
the history of the game up to this point, the strategy being 
used by the player, and perhaps some other features of the 
player's type, such as beliefs about the strategies being used 
by other players. A global state is a tuple consisting of a local 
state for each player. 

A protocol for player i is a function from player i's lo- 
cal states to actions. For ease of exposition, we consider 
only deterministic protocols, although it is relatively straight- 
forward to model randomized protocols — corresponding to 
mixed strategies — as functions from local states to distribu- 
tions over actions. Although we restrict to deterministic pro- 
tocols, we deal with mixed strategies by considering distribu- 
tions over pure strategies. 

A run is a sequence of global states; formally, a run is
a function from times to global states. Thus, $r(m)$ is the
global state in run r at time m. A point is a pair $(r, m)$
consisting of a run r and time m. Let $r_i(m)$ be i's local
state at the point $(r, m)$; that is, if $r(m) = (s_1, \ldots, s_n)$, then
$r_i(m) = s_i$. A joint protocol is an assignment of a protocol
for each player; essentially, a joint protocol is a joint strategy.
At each point, a joint protocol P performs a joint action
$(P_1(r_1(m)), \ldots, P_n(r_n(m)))$, which changes the global
state. Thus, given an initial global state, a joint protocol P
generates a (unique) run, which can be thought of as an execution
of P. The runs in a normal-form game involve only
one round and two time steps: time 0 (the initial state) and
time 1, after the joint strategy has been executed. (We assume
that the payoff is then represented in the player's local
state at time 1.) In an extensive-form game, a run is again
characterized by the strategies used, but now the length of the
run depends on the path of play.
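
As a minimal sketch of these definitions, the following Python fragment models a deterministic protocol as a function from local states to actions and generates the two-step run of a normal-form game. The names `Protocol`, `play_own_type`, and `generate_run`, and the encoding of a local state as a tuple of strings, are illustrative assumptions rather than part of the framework.

```python
from typing import Callable, List, Tuple

LocalState = Tuple[str, ...]          # hypothetical encoding of a local state
GlobalState = Tuple[LocalState, ...]  # one local state per player
Protocol = Callable[[LocalState], str]


def play_own_type(local: LocalState) -> str:
    """Protocol that plays the strategy named in the initial local state."""
    return local[0]


def generate_run(initial: GlobalState, joint: List[Protocol]) -> List[GlobalState]:
    """Return the run [r(0), r(1)] of a normal-form game.

    At time 0 each player applies its protocol to its own local state only.
    At time 1 each local state also records the joint action that was played
    (an assumed stand-in for the payoff information mentioned in the text).
    """
    joint_action = tuple(p(s) for p, s in zip(joint, initial))
    after = tuple(s + joint_action for s in initial)
    return [initial, after]


# Alice's type says "T", Bob's type says "L".
run = generate_run((("T",), ("L",)), [play_own_type, play_own_type])
```

Because each protocol sees only its own component of the global state, the sketch respects the requirement that what a player does depends only on that player's local state.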

A probabilistic system is a tuple $\mathcal{PS} = (\mathcal{R}, \vec{\mu})$, where $\mathcal{R}$ is
a set of runs and $\vec{\mu} = (\mu_1, \ldots, \mu_n)$ associates a probability $\mu_i$
on the runs of $\mathcal{R}$ with each player i. Intuitively, $\mu_i$ represents
player i's prior beliefs. In the special case where $\mu_1 = \cdots = \mu_n = \mu$,
the players have a common prior $\mu$ on $\mathcal{R}$. In this
case, we write just $(\mathcal{R}, \mu)$.

We are interested in the system corresponding to a joint
protocol P. To determine this system, we need to describe the
setting in which P is being executed. For our purposes, this
setting can be modeled by a set $\mathcal{G}$ of global states, a subset $\mathcal{G}_0$
of $\mathcal{G}$ that describes the possible initial global states, a set $A_s$
of possible joint actions at each global state s, and n probability
measures on $\mathcal{G}_0$, one for each player. Thus, a probabilistic
context is a tuple $\gamma = (\mathcal{G}, \mathcal{G}_0, \{A_s : s \in \mathcal{G}\}, \vec{\mu})$.¹ A joint

1 We are implicitly assuming that the global state that results from 



protocol P is appropriate for such a context $\gamma$ if, for every
global state s, the joint actions that P can generate are in $A_s$.
When P is appropriate for $\gamma$, we abuse notation slightly and
refer to $\gamma$ by specifying only the pair $(\mathcal{G}_0, \vec{\mu})$. A protocol P
and a context $\gamma$ for which P is appropriate generate a system;
the system depends on the initial states and probability
measures in $\gamma$. Since these are all that matter, we typically
simplify the description of a context by omitting the set $\mathcal{G}$ of
global states and the sets $A_s$ of global actions. Let $\mathbf{R}(P, \gamma)$
denote the system generated by joint protocol P in context $\gamma$.
If $\gamma = (\mathcal{G}_0, \vec{\mu})$, then $\mathbf{R}(P, \gamma) = (\mathcal{R}, \vec{\mu}')$, where $\mathcal{R}$ consists of
a run $r_s$ for each initial state $s \in \mathcal{G}_0$, where $r_s$ is the run
generated by P when started in state s, and $\mu'_i(r_s) = \mu_i(s)$,
for $i = 1, \ldots, n$.

A probabilistic system $(\mathcal{R}, \vec{\mu}')$ is compatible with a context
$\gamma = (\mathcal{G}_0, \vec{\mu})$ if (a) every initial state in $\mathcal{G}_0$ is the initial
state of some run in $\mathcal{R}$, (b) every run is the run of some protocol
appropriate for $\gamma$, and (c) if $\mathcal{R}(s)$ is the set of runs in
$\mathcal{R}$ with initial global state s, then $\mu'_j(\mathcal{R}(s)) = \mu_j(s)$, for
$j = 1, \ldots, n$. Clearly $\mathbf{R}(P, \gamma)$ is compatible with $\gamma$.

We can think of the context as describing background in- 
formation. In distributed-systems applications, the context 
also typically includes information about message delivery. 
For example, it may determine whether all messages sent are 
received in one round, or whether they may take up to, say, 
five rounds. Moreover, when this is not obvious, the context 
specifies how actions transform the global state; for exam- 
ple, it describes what happens if in the same joint action two 
players attempt to modify the same memory cell. Since such 
issues do not arise in the games we consider, we ignore these 
facets of contexts here. For simplicity, we consider only contexts
where each initial state corresponds to a particular joint
strategy of $\Gamma$. That is, $\Sigma_i^{\Gamma}$ is a set of local states for player i
indexed by (pure) strategies. The set $\Sigma_i^{\Gamma}$ can be viewed as describing
i's types; the state $s_S$ can be thought of as the initial
state where player i's type is such that he plays S (although
we stress that this is only intuition; player i does not have to
play S at the state $s_S$). Let $\mathcal{G}_0^{\Gamma} = \Sigma_1^{\Gamma} \times \cdots \times \Sigma_n^{\Gamma}$. We will be
interested in contexts where the set of initial global states is a
subset $\mathcal{G}_0$ of $\mathcal{G}_0^{\Gamma}$. In a normal-form game, the only actions possible
for player i at an initial global state amount to choosing
a pure strategy, so the joint actions are joint strategies; no actions
are possible at later times. For an extensive-form game,
the possible moves are described by the game tree. We say
that a context for an extensive-form game is standard if the
local states have the form $(s, I)$, where s is the initial state
and I is the current information set. In a standard context,
an agent's knowledge is indeed described by the information 
set. However, we do not require a context to be standard. 
For example, if an agent is allowed to switch strategies, then 
the local state could include the history of strategies used. In 
such a context, the agent in the game of Figure 1 would know 
more than just what is in the information set, and would want 
to switch strategies. 

performing a joint action in $A_s$ at the global state s is unique and obvious;
otherwise, such information would also appear in the context,
as in the general framework of [Fagin et al., 1995].



2.3 Knowledge-Based Programs 

A knowledge-based program is a syntactic object. For our 
purposes, we can take a knowledge-based program for player 
i to have the form 

if $\kappa_1$ then $a_1$
if $\kappa_2$ then $a_2$
...,

where each $\kappa_j$ is a Boolean combination of formulas of the
form $B_i \varphi$, in which the $\varphi$'s can have nested occurrences of $B_\ell$
operators and counterfactual implications. We assume that
the tests $\kappa_1, \kappa_2, \ldots$ are mutually exclusive and exhaustive, so
that exactly one will evaluate to true in any given instance.
The program $\mathrm{EQNF}_i^{\Gamma}$ can be written in this form by simply
replacing the for ... do statement by one line for each pure
strategy in $\mathcal{S}_i(\Gamma)$; similarly for $\mathrm{EQEF}_i^{\Gamma}$.

We want to associate a protocol with a kb program. Unfortunately,
we cannot "execute" a kb program as we can a protocol.
How the kb program executes depends on the outcome
of tests $\kappa_j$. Since the tests involve beliefs and counterfactuals,
we need to interpret them with respect to a system. The idea
is that a kb program $\mathsf{Pg}_i$ for player i and a probabilistic system
$\mathcal{PS}$ together determine a protocol P for player i. Rather
than giving the general definitions (which can be found in
[Halpern and Moses, 2004]), we just show how they work in
the two kb programs we consider in this paper: EQNF and
EQEF.

Given a system $\mathcal{PS} = (\mathcal{R}, \vec{\mu})$, we associate with each formula
$\varphi$ a set $[\![\varphi]\!]_{\mathcal{PS}}$ of points in $\mathcal{PS}$. Intuitively, $[\![\varphi]\!]_{\mathcal{PS}}$ is the
set of points of $\mathcal{PS}$ where the formula $\varphi$ is true. We need a
little notation:

• If E is a set of points in $\mathcal{PS}$, let $\mathcal{R}(E)$ denote the set
of runs going through points in E; that is,
$\mathcal{R}(E) = \{r : \exists m((r, m) \in E)\}$.

• Let $\mathcal{K}_i(r, m)$ denote the set of points that i cannot distinguish
from $(r, m)$:
$\mathcal{K}_i(r, m) = \{(r', m') : r'_i(m') = r_i(m)\}$.
Roughly speaking, $\mathcal{K}_i(r, m)$ corresponds to i's
information set at the point $(r, m)$.

• Given a point $(r, m)$ and a player i, let $\mu_{i,r,m}$ be the
probability measure that results from conditioning $\mu_i$ on
$\mathcal{K}_i(r, m)$, i's information at $(r, m)$. We cannot condition
on $\mathcal{K}_i(r, m)$ directly: $\mu_i$ is a probability measure
on runs, and $\mathcal{K}_i(r, m)$ is a set of points. So we actually
condition, not on $\mathcal{K}_i(r, m)$, but on $\mathcal{R}(\mathcal{K}_i(r, m))$, the
set of runs going through the points in $\mathcal{K}_i(r, m)$. Thus,
$\mu_{i,r,m} = \mu_i \mid \mathcal{R}(\mathcal{K}_i(r, m))$. (For the purposes of this abstract,
we do not specify $\mu_{i,r,m}$ if $\mu_i(\mathcal{R}(\mathcal{K}_i(r, m))) = 0$.
It turns out not to be relevant to our discussion.)
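
The three pieces of notation above can be sketched in Python on a toy system. The runs `r1`, `r2`, `r3`, the local-state labels, and the prior `mu_i` below are hypothetical values chosen only for illustration; the function names `K`, `runs_through`, and `condition` are likewise assumptions.

```python
from fractions import Fraction

# Hypothetical toy system: each run is a list of global states, and a
# global state is a tuple of local states (one per player).
runs = {
    "r1": [("a", "x"), ("a1", "x1")],
    "r2": [("a", "y"), ("a2", "y1")],
    "r3": [("b", "y"), ("b1", "y1")],
}
# Assumed prior of one player over runs.
mu_i = {"r1": Fraction(1, 2), "r2": Fraction(1, 4), "r3": Fraction(1, 4)}


def K(i, r, m):
    """Points that player i cannot distinguish from (r, m)."""
    target = runs[r][m][i]
    return {(r2, m2) for r2, states in runs.items()
            for m2 in range(len(states)) if states[m2][i] == target}


def runs_through(points):
    """R(E): the runs passing through some point of E."""
    return {r for (r, _) in points}


def condition(mu, event_runs):
    """mu conditioned on a set of runs; None when the event has measure 0."""
    total = sum(mu[r] for r in event_runs)
    if total == 0:
        return None  # left unspecified, as in the text
    return {r: (mu[r] / total if r in event_runs else Fraction(0)) for r in mu}


# Player 0's beliefs at the point (r1, 0).
mu_cond = condition(mu_i, runs_through(K(0, "r1", 0)))
```

Conditioning is done on runs rather than points, which is why `runs_through` is applied to the indistinguishability set before calling `condition`.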

The kb programs we consider in this paper use a limited
collection of formulas. We now can define $[\![\varphi]\!]_{\mathcal{PS}}$ for the
formulas we consider that do not involve counterfactuals.

• In a system $\mathcal{PS}$ corresponding to a normal-form game
$\Gamma$, if $S \in \mathcal{S}_i(\Gamma)$, then $[\![\mathit{do}_i(S)]\!]_{\mathcal{PS}}$ is the set of initial
points $(r, 0)$ such that player i uses strategy S in run r.

• Similarly, if $\mathcal{PS}$ corresponds to an extensive-form game,
then $[\![\mathit{do}_i(a)]\!]_{\mathcal{PS}}$ is the set of points $(r, m)$ of $\mathcal{PS}$ at
which i performs action a.



• Player i believes a formula $\varphi$ at a point $(r, m)$ if the
event corresponding to formula $\varphi$ has probability 1 according
to $\mu_{i,r,m}$. That is, $(r, m) \in [\![B_i \varphi]\!]_{\mathcal{PS}}$ if
$\mu_i(\mathcal{R}(\mathcal{K}_i(r, m))) \ne 0$ (so that conditioning on $\mathcal{K}_i(r, m)$
is defined) and $\mu_{i,r,m}([\![\varphi]\!]_{\mathcal{PS}} \cap \mathcal{K}_i(r, m)) = 1$.

• With every run r in the systems we consider, we can associate
the joint (pure) strategy S used in r.² This pure
strategy determines the history in the game, and thus determines
player i's utility. Thus, we can associate with
every point $(r, m)$ player i's expected utility at $(r, m)$,
where the expectation is taken with respect to the probability
$\mu_{i,r,m}$. If u is a real number, then $[\![\mathrm{EU}_i = u]\!]_{\mathcal{PS}}$
is the set of points where player i's expected utility is u;
$[\![\mathrm{EU}_i \le u]\!]_{\mathcal{PS}}$ is defined similarly.

• Assume that $\varphi(x)$ has no occurrences of $\forall$. Then
$[\![\forall x \varphi(x)]\!]_{\mathcal{PS}} = \bigcap_{a \in \mathbb{R}} [\![\varphi[x/a]]\!]_{\mathcal{PS}}$, where $\varphi[x/a]$ is
the result of replacing all occurrences of x in $\varphi$ by a.
That is, $\forall x$ is just universal quantification over x, where
x ranges over the reals. This quantification arises for us
when x represents a utility, so that $\forall x \varphi(x)$ is saying that
$\varphi$ holds for all choices of utility.
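
To make the belief and expected-utility clauses concrete, here is a small Python sketch for a normal-form game in which runs are identified with pure joint strategies. It borrows the payoffs of the game in Figure 2; the prior `mu` and the simplification that a player's initial local state is exactly that player's own strategy are assumptions made for the example.

```python
from fractions import Fraction

# Payoffs (Alice, Bob) of the 2-player game in Figure 2.
payoff = {("T", "L"): (3, 3), ("T", "R"): (1, 4),
          ("B", "L"): (4, 1), ("B", "R"): (0, 0)}
# Assumed common prior over runs (pure joint strategies).
mu = {("T", "L"): Fraction(1, 3), ("T", "R"): Fraction(1, 3),
      ("B", "L"): Fraction(1, 3), ("B", "R"): Fraction(0)}


def conditional(i, run):
    """mu conditioned on the runs where player i's strategy is as in `run`.

    Assumes player i's initial local state is just i's own strategy.
    Returns None when the conditioning event has measure 0.
    """
    same = [r for r in mu if r[i] == run[i]]
    total = sum(mu[r] for r in same)
    if total == 0:
        return None
    return {r: mu[r] / total for r in same}


def believes(i, run, phi):
    """B_i(phi) at (run, 0): phi has conditional probability 1."""
    cond = conditional(i, run)
    return cond is not None and sum(p for r, p in cond.items() if phi(r)) == 1


def expected_utility(i, run):
    """EU_i at (run, 0) under the conditional measure (None if undefined)."""
    cond = conditional(i, run)
    if cond is None:
        return None
    return sum(p * payoff[r][i] for r, p in cond.items())


alice_believes_T = believes(0, ("T", "L"), lambda r: r[0] == "T")
alice_believes_L = believes(0, ("T", "L"), lambda r: r[1] == "L")
alice_eu = expected_utility(0, ("T", "L"))
```

In this sketch Alice believes that she plays T, but she does not believe that Bob plays L, since under the assumed prior she considers L and R equally likely once she knows her own strategy.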

We now give the semantics of formulas involving counterfactuals.
Here we consider only a restricted class of such
formulas, those where the counterfactual only occurs in the
form $\mathit{do}_i(S) > \varphi$, which should be read as "if i were to use
strategy S, then $\varphi$ would be true". Intuitively, $\mathit{do}_i(S) > \varphi$ is
true at a point $(r, m)$ if $\varphi$ holds in a world that differs from
$(r, m)$ only in that i uses the strategy S. That is, $\mathit{do}_i(S) > \varphi$
is true at $(r, m)$ if $\varphi$ is true at the point $(r', m)$ where, in
run r', player i uses strategy S and all the other players
use the same strategy that they do at $(r, m)$. (This can be
viewed as an instance of the general semantics for counterfactuals
used in the philosophy literature [Lewis, 1973;
Stalnaker, 1968] where $\psi > \varphi$ is taken to be true at a world
w if $\varphi$ is true at all the worlds w' closest to w where $\psi$ is
true.) Of course, if i actually uses strategy S in run r, then
r' = r. Similarly, in an extensive-form game $\Gamma$, the closest
point to $(r, m)$ where $\mathit{do}_i(a')$ is true (assuming that a' is an
action that i can perform in the local state $r_i(m)$) is the point
$(r', m)$ where all players other than player i use the same protocol
in r' and r, and i's protocol in r' agrees with i's protocol
in r except at the local state $r_i(m)$, where i performs action
a'. Thus, r' is the run that results from player i making a single
deviation (to a' at time m) from the protocol she uses in
r, and all other players use the same protocol as in r.
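
For the normal-form case, the closest-point semantics can be sketched as follows. Runs are again identified with pure joint strategies, and formulas are modeled as Python predicates on runs; the helper names `closest_run`, `counterfactual`, and `material` are hypothetical. The example also illustrates the earlier remark that a material implication with a false antecedent is true while the corresponding counterfactual need not be.

```python
from typing import Callable, Tuple

Run = Tuple[str, ...]            # a run, identified with a pure joint strategy
Formula = Callable[[Run], bool]  # truth value of a formula at the point (run, 0)


def closest_run(run: Run, i: int, s_prime: str) -> Run:
    """The run that differs from `run` only in that player i uses s_prime."""
    return run[:i] + (s_prime,) + run[i + 1:]


def counterfactual(i: int, s_prime: str, phi: Formula) -> Formula:
    """do_i(s_prime) > phi: phi holds at the closest run where i uses s_prime."""
    return lambda run: phi(closest_run(run, i, s_prime))


def material(antecedent: Formula, phi: Formula) -> Formula:
    """Material implication: true whenever the antecedent is false."""
    return lambda run: (not antecedent(run)) or phi(run)


alice_plays_B: Formula = lambda run: run[0] == "B"
bob_plays_L: Formula = lambda run: run[1] == "L"

point = ("T", "R")  # Alice actually plays T, Bob plays R
vacuous = material(alice_plays_B, bob_plays_L)(point)
would_be = counterfactual(0, "B", bob_plays_L)(point)
```

The sketch presumes a complete system, so that the run returned by `closest_run` is always available.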

There is a problem with this approach. There is no guarantee
that, in general, such a closest point $(r', m)$ exists in
the system $\mathcal{PS}$. To deal with this problem, we restrict attention
to a class of systems where this point is guaranteed to
exist. A system $(\mathcal{R}, \vec{\mu})$ is complete with respect to context $\gamma$
if $\mathcal{R}$ includes every run generated by a protocol appropriate
for context $\gamma$. In complete systems, the closest point $(r', m)$
is guaranteed to exist. For the remainder of the paper, we

2 If we allow players to change strategies during a run, then we 
will in general have different joint strategies at each point in a run. 
For our theorems in the next section, we restrict to contexts where 
players do not change strategies. 



evaluate formulas only with respect to complete systems. In
a complete system $\mathcal{PS}$, we define $[\![\mathit{do}_i(S) > \varphi]\!]_{\mathcal{PS}}$ to consist
of all the points $(r, m)$ such that the closest point $(r', m)$ to
$(r, m)$ where i uses strategy S is in $[\![\varphi]\!]_{\mathcal{PS}}$. The definition
of $[\![\mathit{do}_i(a) > \varphi]\!]_{\mathcal{PS}}$ is similar. We say that a complete system
$(\mathcal{R}', \vec{\mu}')$ extends $(\mathcal{R}, \vec{\mu})$ if $\mu_j$ and $\mu'_j$ agree on $\mathcal{R}$ (so that
$\mu'_j(A) = \mu_j(A)$ for all $A \subseteq \mathcal{R}$) for $j = 1, \ldots, n$.

Since each formula $\kappa$ that appears as a test in a kb program
$\mathsf{Pg}_i$ for player i is a Boolean combination of formulas of the
form $B_i \varphi$, it is easy to check that if $(r, m) \in [\![\kappa]\!]_{\mathcal{PS}}$, then
$\mathcal{K}_i(r, m) \subseteq [\![\kappa]\!]_{\mathcal{PS}}$. In other words, the truth of $\kappa$ depends
only on i's local state. Moreover, since the tests are mutually
exclusive and exhaustive, exactly one of them holds in each
local state. Given a system $\mathcal{PS}$, we take the protocol $\mathsf{Pg}_i^{\mathcal{PS}}$
to be such that $\mathsf{Pg}_i^{\mathcal{PS}}(\ell) = a_j$ if, for some point $(r, m)$ in $\mathcal{PS}$
with $r_i(m) = \ell$, we have $(r, m) \in [\![\kappa_j]\!]_{\mathcal{PS}}$. Since $\kappa_1, \kappa_2, \ldots$
are mutually exclusive and exhaustive, there is exactly one
action $a_j$ with this property.
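
The construction of a protocol from a kb program and a system can be sketched as below. The sketch assumes that each test has already been interpreted in the system and is available as a Python predicate on points; the function name `protocol_from_kb_program` and the toy points and local states are hypothetical.

```python
def protocol_from_kb_program(program, points, local_state):
    """Map each local state to the action of the unique test that holds there.

    program: list of (test, action) pairs, where test(point) -> bool is the
        already-interpreted knowledge test (an assumption of this sketch).
    points: iterable of points (r, m) of the system.
    local_state: function taking a point to the player's local state there.
    """
    protocol = {}
    for point in points:
        actions = [a for test, a in program if test(point)]
        # Tests are assumed mutually exclusive and exhaustive.
        assert len(actions) == 1, "exactly one test must hold at each point"
        ell = local_state(point)
        # Tests depend only on the local state, so repeated states must agree.
        assert protocol.setdefault(ell, actions[0]) == actions[0]
    return protocol


# Hypothetical toy system with two points and one observable fact.
points = [("r1", 0), ("r2", 0)]
local = {("r1", 0): "got_info", ("r2", 0): "no_info"}.get
program = [(lambda pt: local(pt) == "no_info", "send"),
           (lambda pt: local(pt) == "got_info", "wait")]
proto = protocol_from_kb_program(program, points, local)
```

The two internal assertions correspond to the two facts used in the text: exactly one test holds at each point, and the truth of a test depends only on the local state.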

We are mainly interested in protocols that implement a kb
program. Intuitively, a joint protocol P implements a kb
program $\mathsf{Pg}$ in context $\gamma$ if P performs the same actions as
$\mathsf{Pg}$ in all runs of P that have positive probability, assuming
that the knowledge tests in $\mathsf{Pg}$ are interpreted with respect
to the complete system $\mathcal{PS}$ extending $\mathbf{R}(P, \gamma)$. Formally,
a joint protocol P (de facto) implements a joint kb program
$\mathsf{Pg}$ [Halpern and Moses, 2004] in a context $\gamma = (\mathcal{G}_0, \vec{\mu})$ if
$P_i(\ell) = \mathsf{Pg}_i^{\mathcal{PS}}(\ell)$ for every local state $\ell = r_i(m)$ such that
$r \in \mathbf{R}(P, \gamma)$ and $\mu_i(r) \ne 0$, where $\mathcal{PS}$ is the complete system
extending $\mathbf{R}(P, \gamma)$. We remark that, in general, there
may not be any joint protocols that implement a kb program
in a given context, there may be exactly one, or there may be
more than one (see [Fagin et al., 1995] for examples). This
is somewhat analogous to the fact that there may not be any
equilibrium of a game for some notions of equilibrium, there
may be one, or there may be more than one.

3 The Main Results 

Fix a game $\Gamma$ in normal form. Let $P_i^{nf}$ be the protocol that,
in initial state $s_S \in \Sigma_i^{\Gamma}$, chooses strategy S; let
$P^{nf} = (P_1^{nf}, \ldots, P_n^{nf})$. Let $\mathrm{STRAT}_i$ be the random variable on
initial global states that associates with an initial global state s
player i's strategy in s. As we said, Nash equilibrium arises
in contexts with a common prior. Suppose that $\gamma = (\mathcal{G}_0^{\Gamma}, \mu)$ is
a context with a common prior. We say that $\mu$ is compatible
with the mixed joint strategy S if $\mu$ is the probability on pure
joint strategies induced by S (under the obvious identification
of initial global states with joint strategies).

Theorem 3.1: The joint strategy S is a Nash equilibrium of
the game $\Gamma$ iff there is a common prior probability measure
$\mu$ on $\mathcal{G}_0^{\Gamma}$ such that $\mathrm{STRAT}_1, \ldots, \mathrm{STRAT}_n$ are independent
with respect to $\mu$, $\mu$ is compatible with S, and $P^{nf}$ implements
$\mathrm{EQNF}^{\Gamma}$ in the context $(\mathcal{G}_0^{\Gamma}, \mu)$.

Proof: Suppose that S is a (possibly mixed strategy) Nash
equilibrium of the game $\Gamma$. Let $\mu_S$ be the unique probability
on $\mathcal{G}_0^{\Gamma}$ compatible with S. If S is played, then the probability
of a run where the pure joint strategy $(T_1, \ldots, T_n)$ is
played is just the product of the probabilities assigned to $T_i$
by $S_i$, so $\mathrm{STRAT}_1, \ldots, \mathrm{STRAT}_n$ are independent with respect
to $\mu_S$. To see that $P^{nf}$ implements $\mathrm{EQNF}^{\Gamma}$ in the
context $\gamma = (\mathcal{G}_0^{\Gamma}, \mu_S)$, let $\ell = r_i(0)$ be a local state such that
$r \in \mathbf{R}(P^{nf}, \gamma)$ and $\mu_S(r) \ne 0$. If $\ell = s_T$, then $P_i^{nf}(\ell) = T$,
so T must be in the support of $S_i$. Thus, T must be a best
response to $S_{-i}$, the joint strategy where each player $j \ne i$
plays its component of S. Since i uses strategy T in r, the formula
$B_i(\mathit{do}_i(T'))$ holds at $(r, 0)$ iff $T' = T$. Moreover, since
T is a best response, if u is i's expected utility with the joint
strategy S, then for all T', the formula $\mathit{do}_i(T') > (\mathrm{EU}_i \le u)$
holds at $(r, 0)$. Thus, $(\mathrm{EQNF}_i^{\Gamma})^{\mathcal{PS}}(\ell) = T$, where $\mathcal{PS}$ is
the complete system extending $\mathbf{R}(P^{nf}, \gamma)$. It follows that
$P^{nf}$ implements $\mathrm{EQNF}^{\Gamma}$.

For the converse, suppose that $\mu$ is a common prior probability
measure on $\mathcal{G}_0^{\Gamma}$, $\mathrm{STRAT}_1, \ldots, \mathrm{STRAT}_n$ are independent
with respect to $\mu$, $\mu$ is compatible with S, and $P^{nf}$ implements
$\mathrm{EQNF}^{\Gamma}$ in the context $\gamma = (\mathcal{G}_0^{\Gamma}, \mu)$. We want to
show that S is a Nash equilibrium. It suffices to show that
each pure strategy T in the support of $S_i$ is a best response
to $S_{-i}$. Since $\mu$ is compatible with S, there must be a run r
such that $\mu(r) > 0$ and $r_i(0) = s_T$ (i.e., player i chooses T in
run r). Since $P^{nf}$ implements $\mathrm{EQNF}^{\Gamma}$, and in the context
$\gamma$, $\mathrm{EQNF}^{\Gamma}$ ensures that no deviation from T can improve
i's expected utility with respect to $S_{-i}$, it follows that T is
indeed a best response. ∎

As is well known, players can sometimes achieve better 
outcomes than a Nash equilibrium if they have access to a 
helpful mediator. Consider the simple 2-player game de- 
scribed in Figure 2, where Alice, the row player, must choose 
between top and bottom (T and B), while Bob, the column 
player, must choose between left and right (L and R): 



        L       R
T     (3,3)   (1,4)
B     (4,1)   (0,0)



Figure 2: A simple 2-player game. 

It is not hard to check that the best Nash equilibrium for 
this game has Alice randomizing between T and B, and Bob 
randomizing between L and R; this gives each of them ex- 
pected utility 2. They can do better with a trusted mediator, 
who makes a recommendation by choosing at random be- 
tween (T,L), (T, R), and (B, L). This gives each of them 
expected utility 8/3. This is a correlated equilibrium since, 
for example, if the mediator chooses (T, L), and thus sends 
recommendation T to Alice and L to Bob, then Alice con- 
siders it equally likely that Bob was told L and R, and thus 
has no incentive to deviate; similarly, Bob has no incentive to 
deviate. In general, a distribution $\mu$ over pure joint strategies
is a correlated equilibrium if players cannot do better than
following a mediator's recommendation when the mediator makes
recommendations according to $\mu$. (Note that, as in our example,
if a mediator chooses a joint strategy $(S_1, \ldots, S_n)$ according
to $\mu$, the mediator recommends $S_i$ to player $i$; player
$i$ is not told the joint strategy.) We omit the formal definition
of correlated equilibrium (due to Aumann [1974]) here; however,
we stress that a correlated equilibrium is a distribution
over (pure) joint strategies. We can easily capture correlated
equilibrium using $\mathit{EQNF}$.
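The numbers in the example can be checked mechanically. The following is a minimal sketch, assuming the payoffs of Figure 2 are (3,3) for (T,L), (1,4) for (T,R), (4,1) for (B,L), and (0,0) for (B,R); the function names are ours and are not part of the paper's formalism. It treats the mixed Nash equilibrium as the product distribution it induces on pure joint strategies, and checks the usual no-profitable-deviation condition for each recommendation.

```python
from fractions import Fraction as F

# Payoffs (Alice, Bob) assumed for Figure 2; rows T/B, columns L/R.
PAYOFF = {
    ("T", "L"): (3, 3), ("T", "R"): (1, 4),
    ("B", "L"): (4, 1), ("B", "R"): (0, 0),
}
ROWS, COLS = ("T", "B"), ("L", "R")

def expected_utilities(dist):
    """Expected (Alice, Bob) utility under a distribution over pure joint strategies."""
    return tuple(sum(p * PAYOFF[js][k] for js, p in dist.items()) for k in (0, 1))

def is_correlated_equilibrium(dist):
    """True iff no player gains by deviating from any recommendation."""
    for rec in ROWS:          # Alice is told `rec`, considers playing `dev`
        for dev in ROWS:
            follow = sum(dist.get((rec, c), 0) * PAYOFF[(rec, c)][0] for c in COLS)
            deviate = sum(dist.get((rec, c), 0) * PAYOFF[(dev, c)][0] for c in COLS)
            if deviate > follow:
                return False
    for rec in COLS:          # Bob is told `rec`, considers playing `dev`
        for dev in COLS:
            follow = sum(dist.get((r, rec), 0) * PAYOFF[(r, rec)][1] for r in ROWS)
            deviate = sum(dist.get((r, rec), 0) * PAYOFF[(r, dev)][1] for r in ROWS)
            if deviate > follow:
                return False
    return True

# Mixed Nash equilibrium: both players randomize 1/2-1/2 independently.
mixed = {(r, c): F(1, 4) for r in ROWS for c in COLS}
# Mediator: uniform over (T,L), (T,R), (B,L).
mediator = {("T", "L"): F(1, 3), ("T", "R"): F(1, 3), ("B", "L"): F(1, 3)}

print(expected_utilities(mixed), is_correlated_equilibrium(mixed))
print(expected_utilities(mediator), is_correlated_equilibrium(mediator))
```

Exact rational arithmetic is used so that the indifference conditions (for example, Alice being exactly indifferent between T and B when told T) are not blurred by rounding.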

Theorem 3.2: The distribution $\mu$ on joint strategies is a correlated
equilibrium of the game $\Gamma$ iff $P^{nf}$ implements $\mathit{EQNF}^\Gamma$
in the context $(\mathcal{G}_0^\Gamma, \mu)$.

Both Nash equilibrium and correlated equilibrium require 
a common prior on runs. By dropping this assumption, we get 
another standard solution concept: rationalizability [Bern- 
heim, 1984; Pearce, 1984]. Intuitively, a strategy for player 
i is rationalizable if it is a best response to some beliefs that 
player i may have about the strategies that other players are 
following, assuming that these strategies are themselves best 
responses to beliefs that the other players have about strate- 
gies that other players are following, and so on. To make 
this precise, we need a little notation. Let $\mathcal{S}_{-i} = \prod_{j \neq i} \mathcal{S}_j$.
Let $u_i(\vec{S})$ denote player $i$'s utility if the strategy tuple $\vec{S}$ is
played. We describe player $i$'s beliefs about what strategies
the other players are using by a probability $\mu_i$ on $\mathcal{S}_{-i}$. A
strategy $S$ for player $i$ is a best response to beliefs described
by a probability $\mu_i$ on $\mathcal{S}_{-i}(\Gamma)$ if
$\sum_{\vec{T} \in \mathcal{S}_{-i}} u_i(S, \vec{T}) \mu_i(\vec{T}) \ge \sum_{\vec{T} \in \mathcal{S}_{-i}} u_i(S', \vec{T}) \mu_i(\vec{T})$
for all $S' \in \mathcal{S}_i$. Following Osborne
and Rubinstein [1994], we say that a strategy $S$ for player $i$
in game $\Gamma$ is rationalizable if, for each player $j$, there is a
set $Z_j \subseteq \mathcal{S}_j(\Gamma)$ and, for each strategy $T \in Z_j$, a probability
measure $\mu_{j,T}$ on $\mathcal{S}_{-j}(\Gamma)$ whose support is $Z_{-j}$ such that

• $S \in Z_i$; and

• for each player $j$ and strategy $T \in Z_j$, $T$ is a best
response to the beliefs $\mu_{j,T}$.
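The two conditions of this definition can be checked directly once candidate sets $Z_j$ and beliefs $\mu_{j,T}$ are supplied. The following is a minimal sketch, not a procedure for finding such sets; the function names and the data layout (beliefs as dictionaries from opponents' strategy tuples to probabilities) are our own assumptions, and the example reuses the payoffs assumed for Figure 2. Following the wording above, it requires the support of each belief to equal $Z_{-j}$.

```python
from fractions import Fraction as F
from itertools import product

def expected_utility(u, i, own, belief):
    """Player i's expected utility of `own` against a belief over opponents' tuples."""
    total = 0
    for others, prob in belief.items():
        profile = others[:i] + (own,) + others[i:]   # re-insert i's strategy
        total += prob * u(profile)[i]
    return total

def is_best_response(u, strategies, i, own, belief):
    """True iff `own` maximizes i's expected utility against `belief`."""
    best = max(expected_utility(u, i, alt, belief) for alt in strategies[i])
    return expected_utility(u, i, own, belief) == best

def witnesses_rationalizability(u, strategies, Z, beliefs):
    """Check that (Z, beliefs) satisfies the definition for every player j."""
    n = len(strategies)
    for j in range(n):
        Z_minus_j = set(product(*(Z[k] for k in range(n) if k != j)))
        for T in Z[j]:
            mu = beliefs[j][T]
            support = {o for o, p in mu.items() if p > 0}
            if sum(mu.values()) != 1 or support != Z_minus_j:
                return False
            if not is_best_response(u, strategies, j, T, mu):
                return False
    return True

# Example: the game of Figure 2 (payoffs as assumed there).
PAYOFF = {("T", "L"): (3, 3), ("T", "R"): (1, 4),
          ("B", "L"): (4, 1), ("B", "R"): (0, 0)}
u = PAYOFF.__getitem__
strategies = [("T", "B"), ("L", "R")]
Z = [{"T", "B"}, {"L", "R"}]
half = F(1, 2)
beliefs = [
    {T: {("L",): half, ("R",): half} for T in Z[0]},   # Alice's beliefs about Bob
    {T: {("T",): half, ("B",): half} for T in Z[1]},   # Bob's beliefs about Alice
]
print(witnesses_rationalizability(u, strategies, Z, beliefs))
```

With uniform beliefs each player is indifferent between both of its strategies, so every pure strategy in this game is a best response to a belief supported on the other player's candidate set.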

For ease of exposition, we consider only pure rationaliz- 
able strategies. This is essentially without loss of generality. 
It is easy to see that a mixed strategy $S$ for player $i$ is a best
response to some beliefs $\mu_i$ of player $i$ iff each pure strategy
in the support of $S$ is a best response to $\mu_i$. Moreover, we
can assume without loss of generality that the support of $\mu_i$
consists of only pure joint strategies.

Theorem 3.3: A pure strategy $S$ for player $i$ is rationalizable
iff there exist probability measures $\mu_1, \ldots, \mu_n$, a set
$\mathcal{G}_0 \subseteq \mathcal{G}_0^\Gamma$, and a state $s \in \mathcal{G}_0$ such that $P_i^{nf}(s_i) = S$ and $P^{nf}$
implements $\mathit{EQNF}^\Gamma$ in the context $(\mathcal{G}_0, \vec{\mu})$.

Proof: First, suppose that $P^{nf}$ implements $\mathit{EQNF}^\Gamma$ in the
context $(\mathcal{G}_0, \vec{\mu})$. We show that for each state $s \in \mathcal{G}_0$ and player
$i$, the strategy $S_{s,i} = P_i^{nf}(s_i)$ is rationalizable. Let
$Z_i = \{S_{s,i} : s \in \mathcal{G}_0\}$. For $S \in Z_i$, let $E(S) = \{s \in \mathcal{G}_0 : s_i = s_S\}$;
that is, $E(S)$ consists of all initial global states where
player $i$'s local state is $s_S$; let $\mu_{i,S} = \mu_i(\cdot \mid E(S))$ (under the
obvious identification of global states in $\mathcal{G}_0$ with joint
strategies). Since $P^{nf}$ implements $\mathit{EQNF}^\Gamma$, it easily follows that
$S$ is a best response to $\mu_{i,S}$. Hence, all the strategies in $Z_i$ are
rationalizable, as desired.



For the converse, let $Z_i$ consist of all the pure rationalizable
strategies for player $i$. It follows from the definition of
rationalizability that, for each strategy $S \in Z_i$, there exists
a probability measure $\mu_{i,S}$ on $Z_{-i}$ such that $S$ is a best
response to $\mu_{i,S}$. For a set $Z$ of strategies, we denote by $\tilde{Z}$ the
set $\{s_T : T \in Z\}$. Set $\mathcal{G}_0 = \tilde{Z}_1 \times \cdots \times \tilde{Z}_n$, and choose some
measure $\mu_i$ on $\mathcal{G}_0$ such that $\mu_i(\cdot \mid E(S)) = \mu_{i,S}$ for all $S \in Z_i$.
(We can take $\mu_i = \sum_{S \in Z_i} \alpha_S \mu_{i,S}$, where $\alpha_S \in (0, 1)$
and $\sum_{S \in Z_i} \alpha_S = 1$.) Recall that $P_i^{nf}(s_S) = S$ for all states
$s_S$. It immediately follows that, for every rationalizable joint
strategy $\vec{S} = (S_1, \ldots, S_n)$, both $s = (s_{S_1}, \ldots, s_{S_n}) \in \mathcal{G}_0$
and $\vec{S} = P^{nf}(s)$. Since the states in $\mathcal{G}_0$ all correspond to
rationalizable strategies, and by definition of rationalizability
each (individual) strategy $S_i$ is a best response to $\mu_{i,S_i}$, it is
easy to check that $P^{nf}$ implements $\mathit{EQNF}^\Gamma$ in the context
$(\mathcal{G}_0, \vec{\mu})$, as desired. ∎

We remark that Osborne and Rubinstein's definition of
rationalizability allows $\mu_{j,T}$ to be such that $j$ believes that other
players' strategy choices are correlated. In most of the
literature, players are assumed to believe that other players'
choices are made independently. If we add that requirement,
then we must impose the same requirement on the probability
measures $\mu_1, \ldots, \mu_n$ in Theorem 3.3.

Up to now we have considered solution concepts for games 
in normal form. Perhaps the best-known solution concept for 
games in extensive form is sequential equilibrium [Kreps and 
Wilson, 1982]. Roughly speaking, a joint strategy $\vec{S}$ is a
sequential equilibrium if $S_i$ is a best response to $\vec{S}_{-i}$ at all
information sets, not just the information sets that are reached with
positive probability when playing $\vec{S}$. To understand how
sequential equilibrium differs from Nash equilibrium, consider
the game shown in Figure 3.



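[Game tree: A chooses between $\mathit{down}_A$ and $\mathit{across}_A$; after $\mathit{across}_A$, B chooses between $\mathit{down}_B$ and $\mathit{across}_B$.]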



Figure 3: A game with an unreasonable Nash equilibrium. 

One Nash equilibrium of this game has A playing $\mathit{down}_A$
and B playing $\mathit{across}_B$. However, this is not a sequential
equilibrium, since playing across is not a best response for
B if B is called on to play. This is not a problem in a
Nash equilibrium because the node where B plays is not
reached in the equilibrium. Sequential equilibrium refines
Nash equilibrium (in the sense that every sequential equilibrium
is a Nash equilibrium) and does not allow solutions such
as $(\mathit{down}_A, \mathit{across}_B)$. Intuitively, in a sequential equilibrium,
every player must make a best response at every information
set (even if it is reached with probability 0). In the game
shown in Figure 3, the unique joint strategy in a sequential
equilibrium has A choosing $\mathit{across}_A$ and B choosing $\mathit{down}_B$.

The main difficulty in defining sequential equilibrium lies
in capturing the intuition of best response in information sets
that are reached with probability 0. To deal with this, a
sequential equilibrium is defined to be a pair $(\vec{S}, \beta)$, consisting
of a joint strategy $\vec{S}$ and a belief system $\beta$, which
associates with every information set $I$ a probability $\beta(I)$ on
the histories in $I$. There are a number of somewhat subtle
consistency conditions on these pairs; we omit them
here due to lack of space (see [Kreps and Wilson, 1982;
Osborne and Rubinstein, 1994] for details). Our result
depends on a recent characterization of sequential equilibrium
[Halpern, 2006] that uses nonstandard probabilities, which
can assign infinitesimal probabilities to histories. By assuming
that every history gets positive (although possibly
infinitesimal) probability, we can avoid the problem of dealing
with information sets that are reached with probability 0.

To every nonstandard real number $r$, there is a closest standard
real number denoted $\mathit{st}(r)$, and read "the standard part
of $r$": $|r - \mathit{st}(r)|$ is an infinitesimal. Given a nonstandard
probability measure $\nu$, we can define the standard probability
measure $\mathit{st}(\nu)$ by taking $\mathit{st}(\nu)(w) = \mathit{st}(\nu(w))$. A
nonstandard probability $\nu$ on $\mathcal{G}_0$ is compatible with joint strategy
$\vec{S}$ if $\mathit{st}(\nu)$ is the probability on pure strategies induced by
$\vec{S}$. When dealing with nonstandard probabilities, we generalize
the definition of implementation by requiring only that $P$
performs the same actions as $\mathsf{Pg}$ in runs $r$ of $P$ such that
$\mathit{st}(\nu)(r) > 0$. Moreover, the expression "$\mathit{EU}_i = x$" in
$\mathit{EQEF}^\Gamma$ is interpreted as "the standard part of $i$'s expected
utility is $x$" (since $x$ ranges over the standard real numbers).
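To make the role of the standard part concrete, here is a deliberately small sketch. It assumes a toy model in which a nonstandard number is written $a + b\varepsilon$ for a fixed positive infinitesimal $\varepsilon$, keeping only first-order terms; the class `Eps`, the helper `st_ratio`, and the state labels are all hypothetical and are not the construction used in [Halpern, 2006]. It shows a measure that gives every state positive probability while its standard part gives some states probability 0, and shows that conditioning on such states is still well defined.

```python
from fractions import Fraction as F

class Eps:
    """Toy nonstandard number a + b*eps, with eps a positive infinitesimal."""
    def __init__(self, a, b=0):
        self.a, self.b = F(a), F(b)
    def __add__(self, other):
        return Eps(self.a + other.a, self.b + other.b)
    def is_positive(self):
        # a dominates; the eps-coefficient only matters when a == 0
        return self.a > 0 or (self.a == 0 and self.b > 0)
    def st(self):
        """Standard part: drop the infinitesimal term."""
        return self.a

def st_ratio(x, y):
    """Standard part of x / y for positive y, in the first-order toy model."""
    if not y.is_positive():
        raise ValueError("denominator must be positive")
    if y.a != 0:
        return x.a / y.a
    if x.a != 0:
        raise ValueError("ratio is infinite")
    return x.b / y.b

# A nonstandard measure on three hypothetical states: every state has
# positive probability, but w2 and w3 only infinitesimally so.
nu = {"w1": Eps(1, -3), "w2": Eps(0, 1), "w3": Eps(0, 2)}

st_nu = {w: p.st() for w, p in nu.items()}      # standard measure st(nu)
event = nu["w2"] + nu["w3"]                     # nu({w2, w3}), infinitesimal
cond_w2 = st_ratio(nu["w2"], event)             # st(nu(w2 | {w2, w3}))
print(st_nu, cond_w2)
```

Under `st_nu` the event {w2, w3} has probability 0, so ordinary conditioning on it is undefined; under `nu` the conditional probability of w2 has standard part 1/3.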

Theorem 3.4: If $\Gamma$ is a game with perfect recall, there is a
belief system $\beta$ such that $(\vec{S}, \beta)$ is a sequential equilibrium of $\Gamma$
iff there is a common prior nonstandard probability measure
$\nu$ on $\mathcal{G}_0$ that gives positive measure to all states such that
$\mathit{STRAT}_1, \ldots, \mathit{STRAT}_n$ are independent with respect to $\nu$,
$\nu$ is compatible with $\vec{S}$, and $P^{ef}$ implements $\mathit{EQEF}^\Gamma$ in the
standard context $(\mathcal{G}_0, \nu)$.

This is very similar in spirit to Theorem 3.1. The key dif- 
ference is the use of a nonstandard probability measure. Intu- 
itively, this forces S to be a best response even at information 
sets that are reached with (standard) probability 0. 

The effect of interpreting "$\mathit{EU}_i = x$" as "the standard part
of $i$'s expected utility is $x$" is that we ignore infinitesimal
differences. Thus, for example, the strategy $P_i^{ef}(s_0)$ might not
be a best response to $\vec{S}_{-i}$; it might just be an $\epsilon$-best response
for some infinitesimal $\epsilon$. As we show in the full paper, it
follows from Halpern's [2006] results that we can also obtain a
characterization of (trembling hand) perfect equilibrium
[Selten, 1975], another standard refinement of Nash equilibrium,
if we interpret "$\mathit{EU}_i = x$" as "the expected utility for agent $i$
is $x$" and allow $x$ to range over the nonstandard reals instead
of just the standard reals.

3 These are games where players remember all actions made and 
the states they have gone through; we give a formal definition in the 
full paper. See also [Osborne and Rubinstein, 1994].



4 Conclusions 

We have shown how a number of different solution concepts
from game theory can be captured by essentially one
knowledge-based program, which comes in two variants: one
appropriate for normal-form games and one for extensive-form
games. The differences between these solution concepts
are captured by changes in the context in which the games are
played: whether players have a common prior (for Nash
equilibrium, correlated equilibrium, and sequential equilibrium)
or not (for rationalizability); whether strategies are chosen
independently (for Nash equilibrium, sequential equilibrium,
and rationalizability) or not (for correlated equilibrium); and
whether uncertainty is represented using a standard or
nonstandard probability measure.

Our results can be viewed as showing that each of these
solution concepts sc can be characterized in terms of common
knowledge of rationality (since the kb programs $\mathit{EQNF}^\Gamma$
and $\mathit{EQEF}^\Gamma$ embody rationality, and we are interested in
systems "generated" by these programs, so that rationality
holds at all states), and common knowledge of some other
features $X_{sc}$ captured by the context appropriate for sc (e.g.,
that strategies are chosen independently or that there is a common prior).
Roughly speaking, our results say that if $X_{sc}$ is common
knowledge in a system, then common knowledge of rationality
implies that the strategies used must satisfy solution
concept sc; conversely, if a joint strategy $\vec{S}$ satisfies sc, then there
is a system where $X_{sc}$ is common knowledge, rationality is
common knowledge, and $\vec{S}$ is being played at some state.
Results similar in spirit have been proved for rationalizability
[Brandenburger and Dekel, 1987] and correlated equilibrium
[Aumann, 1987]. Our approach allows us to unify and
extend these results and, as suggested in the introduction,
applies even to settings where the game is not common
knowledge and to settings where uncertainty is not represented by
probability. We believe that the approach captures the essence
of the intuition that a solution concept should embody
common knowledge of rationality.